Advanced Connection Parameters

Prerequisites

In the quickstart guide, we covered the basic utilities required to run a Python function on a remote machine using a URL.

The previous tutorials covered machine definitions using BaseComputer.

As BaseComputer is a subclass of URL, you can further enhance those workflows with the extra functionality covered here.

Creating a Connection

Assuming we have a machine which we can ssh into without issues, let's create a URL:

[2]:
from remotemanager import URL

# we will use a localhost connection to ensure compatibility for this tutorial
connection = URL(host='localhost')

We now have the concept of a “connection” to a remote machine. Though here we are using a simple “localhost” connection, remember that URL is able to (and intended to) connect outside of your current workstation by specifying a connection to that machine.

A simple rule-of-thumb is to use URL(<remote>) where <remote> is whatever you would use for an ssh <remote> ... command.

See the relevant Quickstart section for more info.
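As a sketch (the username and address here are placeholders), a connection to a real remote could be created with either form:

# equivalent to what you would reach with `ssh username@remote.connection.address`
connection_a = URL('username@remote.connection.address')
connection_b = URL(user='username', host='remote.connection.address')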

Tip

You can quickly check your connection at any time by issuing a command on the machine with connection.cmd('...'). pwd and/or ls will also likely tell you whether or not you're connected to the right machine.
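For example (the output will depend on your machine):

# a quick sanity check that the connection behaves as expected
print(connection.cmd('pwd'))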

Testing your Connection

Added in version 0.5.10.

URL also provides a test_connection() method, which will attempt to connect to the remote and run a test suite. The results of this test are a strong indicator of whether remotemanager can run jobs on your machine.

[3]:
connection.test_connection()
Checking for entry point... Success (/home/ljbeal/Work/Devel/remotemanager/docs/source/tutorials)
Checking file creation in home... True
Checking file creation in /tmp... True
Checking file creation in /scratch... False
Testing remotemanager.transport.rsync:
        send... Transferring 3 Files... Done
True
        pull... Transferring 1 File... Done
True
Testing remotemanager.transport.scp:
        send... Transferring 3 Files... Done
True
        pull... Transferring 1 File... Done
True
Cleaning up... Done
Done! Made 15 calls, taking 0.19s
Approximate latency, 0.01s
Tests passed successfully

The results are returned in dictionary format and can be queried here, or from the URL itself:

test_connection creates a ConnectionTest object and runs the contained tests. Within, we test the minimal required functionality for running jobs:

  1. connection to the remote

  2. creation of a file in at least the home dir

  3. functional transport system

Provided these three conditions are true, the test will evaluate as passed.

[4]:
connection.connection_test.passed
[4]:
True

Some extra parameters are checked and stored within data (which stores useful info) and extra (which stores errors and minor details). You can query these if you are interested in their content.
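For example, a sketch assuming data and extra are attributes of the ConnectionTest object described above:

test = connection.connection_test
print(test.data)   # useful info gathered during the test
print(test.extra)  # errors and minor details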

The test also runs some basic timing checks and calculates a very rough latency:

[5]:
print(connection.connection_test.latency)

print(connection.latency)  # this is also available from the root URL object
0.012922207514444986
0.012922207514444986

Ping

Added in version 0.5.9.

URL also provides a ping() method, which will attempt to run the ping command on your system, targeting the remote. It takes two arguments: n, the number of returns to wait for from the remote (defaults to 5), and timeout, which limits the total duration (defaults to 30s).

This method will return the delay in ms as a float.
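For example (against our localhost connection, the value will be very small):

# wait for 3 responses, capping the total duration at 10s
delay_ms = connection.ping(n=3, timeout=10)
print(delay_ms)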

URL.cmd

URL is a powerful interface between Python and your remote system; the method you will likely use most in your workflows is URL.cmd().

It has been mentioned occasionally in earlier tutorials, so here we will go through some of the more specialised features.

[6]:
connection.cmd('echo "this command is executed on the remote"')
[6]:
this command is executed on the remote

Internally, URL creates a CMD object and then executes the command, adding the appropriate ssh in order to operate over the network. We will see later how this can be fine-tuned.

Error handling

By default, cmd will raise any errors encountered. If cmd detects anything on stderr (that isn’t an empty string), it will be raised as a RuntimeError:

[7]:
connection.cmd('do a thing')
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[7], line 1
----> 1 connection.cmd('do a thing')

File ~/Work/Devel/remotemanager/remotemanager/connection/url.py:729, in URL.cmd(self, cmd, asynchronous, local, stdout, stderr, timeout, max_timeouts, raise_errors, dry_run, prepend, force_file, landing_dir, stream, verbose)
    726 if dry_run:
    727     return thiscmd
--> 729 thiscmd.exec()
    730 if not local:
    731     self._callcount += 1

File ~/Work/Devel/remotemanager/remotemanager/connection/cmd.py:369, in CMD.exec(self, verbose)
    366     return self._fexec(stdout, stderr, verbose)
    368 try:
--> 369     self._exec(stdout, stderr, verbose)
    370 except OSError as E:
    371     msg = "Encountered an OSError on exec, attempting file exec"

File ~/Work/Devel/remotemanager/remotemanager/connection/cmd.py:469, in CMD._exec(self, stdout, stderr, verbose)
    467 if not self._async and not self.is_redirected:
    468     logger.debug("in-exec communication triggered")
--> 469     self.communicate(verbose=verbose)

File ~/Work/Devel/remotemanager/remotemanager/connection/cmd.py:600, in CMD.communicate(self, use_cache, ignore_errors, verbose)
    598         logger.warning("locale error detected: %s", err)
    599     else:
--> 600         raise RuntimeError(f"received the following stderr: \n{err}")
    602 self._stdout = _clean_output(std)
    603 self._stderr = _clean_output(err)

RuntimeError: received the following stderr:
/bin/bash: -c: line 1: syntax error near unexpected token `do'
/bin/bash: -c: line 1: `do a thing'

Spurious Errors

Some systems can place non-critical warnings onto stderr, which can cause otherwise perfectly functional workflows to think they have failed. If this is the case, you can use raise_errors=False.

Note

If you have a situation where a machine is raising non-fatal errors, raise_errors=False can be passed to the URL itself, which sets all cmd calls to ignore errors by default.
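As a sketch (placeholder host):

# every cmd issued by this URL will ignore stderr by default
lenient = URL(user='username', host='noisy.remote.address', raise_errors=False)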

The CMD Object

URL.cmd also returns a CMD object, which can be stored and queried.

The most useful properties are stdout and stderr, which allow access to these streams after a call has completed.

Note

If a direct CMD call is important, it is advisable to capture it within a variable (such as below). URL does keep a history, but it is limited in size.

[8]:
output = connection.cmd('do a thing', raise_errors=False)
[9]:
print('cmd stdout:', output.stdout)
print('cmd stderr:', output.stderr)
cmd stdout:
cmd stderr: /bin/bash: -c: line 1: syntax error near unexpected token `do'
/bin/bash: -c: line 1: `do a thing'

There are other useful attributes attached to this object, which may assist in your workflows, or debugging:

[10]:
print('Shell process id is:', output.pid)
print('Working dir of the call is:', output.pwd)
print('The command that was sent is:', output.sent)
print('User that executed the cmd:', output.whoami)
print('You also have access to the returned code:', output.returncode)
Shell process id is: 42294
Working dir of the call is: /home/ljbeal/Work/Devel/remotemanager/docs/source/tutorials
The command that was sent is: do a thing
User that executed the cmd: ljbeal
You also have access to the returned code: 2

You can also use the output.kill() method, which will attempt to terminate the process.
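For example, a long-running call could be launched and then terminated early (a sketch; asynchronous calls are covered in more detail below):

running = connection.cmd('sleep 60', asynchronous=True)
running.kill()  # attempt to terminate the underlying process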

CMD History

Added in version 0.6.1.

URL captures your most recent cmd calls within a cmd_history property.

This has a fixed length set by url.cmd_history_depth (defaults to 10).

This is useful for debugging unexpected results from a call which was not captured within a variable.

[11]:
print(connection.cmd('pwd'))
print(connection.cmd_history[-1])
/home/ljbeal/Work/Devel/remotemanager/docs/source/tutorials
/home/ljbeal/Work/Devel/remotemanager/docs/source/tutorials
[12]:
print(type(connection.cmd_history[-1]))
<class 'remotemanager.connection.cmd.CMD'>
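If you need a longer (or shorter) history, the depth can presumably be adjusted like any other attribute; a sketch, assuming cmd_history_depth is writable:

# keep the last 20 CMD objects instead of the default 10
connection.cmd_history_depth = 20
print(connection.cmd_history_depth)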

Async calls

Up until now we have been calling commands sequentially and waiting for the result; however, it's possible to launch a command and proceed without waiting.

Below are two calls that issue a command which waits for 3s, then returns the string “finished!”

We will time how long the execution takes, and how long it takes to get back the result:

[14]:
import time

t0 = time.time()
output1 = connection.cmd('sleep 3 && echo "finished!"')

t1 = time.time()
dt = int(round(t1 - t0))
print(f'call took ~{dt}s')

print(output1)
t2 = time.time()

dt = int(round(t2 - t1))
print(f'collecting the results took ~{dt}s')
call took ~3s
finished!
collecting the results took ~0s
[15]:
t0 = time.time()
output2 = connection.cmd('sleep 3 && echo "finished!"', asynchronous=True)

t1 = time.time()
dt = int(round(t1 - t0))
print(f'call took ~{dt}s')

print(output2)
t2 = time.time()

dt = int(round(t2 - t1))
print(f'collecting the results took ~{dt}s')
call took ~0s
finished!
collecting the results took ~3s

As we can see, the first call waits for completion before returning the result. The second call, however, skips this waiting phase; we don't actually have to wait for the command to execute until we request the result.

Fine Tuning a cmd Call

URL has some further options available which may enhance your workflows in a remote setting. We have already seen the asynchronous argument; let's look at a few more.

To show these systems in more depth, we shall create a “dummy” connection:

[16]:
dummy = URL(user='username', host='remote.connection.address')

Dry Run

If you’re about to issue a command which could be potentially destructive (or time intensive), it is wise to check that it actually looks sensible.

dry_run does just this. Instead of executing the command on the remote, it will simply return what it would execute as a string.

[17]:
dummy.cmd('echo "this call will just be returned as a string"', dry_run=True)
[17]:
ssh -p 22 -q username@remote.connection.address 'echo "this call will just be returned as a string"'

Local

A useful flag which you may need to use is local.

This allows you to run commands on your local machine, even using a URL that is pointed at a remote. See the change in command here:

[18]:
dummy.cmd('echo "this call will just be returned as a string"', local=True, dry_run=True)
[18]:
echo "this call will just be returned as a string"

As this command skips over the remote portion, we don’t actually need the dry_run here.

[19]:
dummy.cmd('echo "this call will just be returned as a string"', local=True)
[19]:
this call will just be returned as a string

Forcing a file-type execution

The internal CMD object has a special run-mode where it will first dump the cmd to a file, then execute that file with bash.

Normally this is used as a backup for situations where the cmd fails to execute. However, you can force this behaviour by passing force_file=True.

Once this cmd is communicated with, the file will be cleared from the system, so we need to use asynchronous=True to prevent this.

[20]:
file_cmd = dummy.cmd('echo "this call will just be returned as a string"', local=True, force_file=True, asynchronous=True)

The filename is stored temporarily in the redirect attribute of the resulting CMD object.

[21]:
tempfile = file_cmd.redirect["execfile"]

print(tempfile)
221971b2.sh
[22]:
with open(tempfile) as o:
    print(o.read())
echo "this call will just be returned as a string"

Now if we access the result of the call, it will attempt to communicate with the process, removing the file as it does so.

[23]:
print(file_cmd.stdout)
this call will just be returned as a string

Global CMD Parameters

The remaining options can be set at the URL level (not just on the cmd() call), so we’ll demonstrate them there.

Options set this way then apply to all cmd calls issued by that URL (though any args passed to the cmd() method will override them).
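As a sketch (placeholder host), a URL-level timeout applies to every call, while a per-call argument takes precedence for that call only:

# URL-level options apply to every cmd call...
patient = URL(user='username', host='remote.connection.address', timeout=10)
# ...but can still be overridden for a single call
patient.cmd('echo "slow operation"', timeout=60, dry_run=True)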

Timeout Parameters

Each call to the remote will attempt to gracefully handle a timeout. In the case of a slow connection, a timeout will occur after timeout seconds. The operation of this is as follows:

  1. If a connection takes longer than timeout seconds to respond, it will issue an internal timeout error.

  2. CMD will then wait for timeout seconds, then retry.

  3. If the attempt fails again, CMD will wait for n*timeout and repeat, where n is the number of current failures + 1.

  4. This continues until max_timeouts is reached, when a RuntimeError will be raised instead.

timeout defaults to 5s, and max_timeouts defaults to 3 attempts.

Note

This occurs on the communicate side of a CMD exec, so an asynchronous call will not see this until you try to access the output (or trigger communicate in another way).

[24]:
dummy = URL(user='username', host='remote.connection.address', timeout=10, max_timeouts=5)
[25]:
print(dummy.timeout)
10
[26]:
print(dummy.max_timeouts)
5

Added in version 0.13.4.

Note

You can now disable the timeout function by setting timeout to 0, a negative number or False.
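For example (the first form assumes the attribute is writable after creation):

dummy.timeout = 0  # assumed writable; disables timeout handling
no_timeout = URL(user='username', host='remote.connection.address', timeout=False)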

Landing Directory

Added in version 0.9.19.

By default, a URL.cmd will “land” in the default directory that a standard ssh would. This can be configured via the landing_dir argument.

To demonstrate this, we will have to hop back over to a functional URL.

(It will also be helpful to create a directory to land in, using our main URL, connection.)

[27]:
connection.cmd("mkdir -p inner_directory")

print("initial landing dir:", connection.cmd('pwd'))
print("updated landing dir:", connection.cmd('pwd', landing_dir="inner_directory"))
initial landing dir: /home/ljbeal/Work/Devel/remotemanager/docs/source/tutorials
updated landing dir: /home/ljbeal/Work/Devel/remotemanager/docs/source/tutorials/inner_directory

Now let's do this at the URL level.

[28]:
url = URL(landing_dir="inner_directory")

url.cmd("pwd")
[28]:
/home/ljbeal/Work/Devel/remotemanager/docs/source/tutorials/inner_directory

Editing the ssh string

URL has an ssh property which will return the string that allows interfacing with the remote. If this needs updating for whatever reason it can be overridden by simply setting the attribute. The example below is used to remove a locale error that can occur on some systems.

Added in version 0.6.0: This specific update is no longer needed, as the locale errors are ignored by default. However the functionality of modifying your ssh remains.

[29]:
print('initial ssh string:', dummy.ssh)

dummy.ssh = 'LANG=C ' + dummy.ssh

print('updated ssh string:', dummy.ssh)
initial ssh string: ssh -p 22 -q username@remote.connection.address
updated ssh string: LANG=C ssh -p 22 -q username@remote.connection.address

To undo this change, set ssh to None, or call url.clear_ssh_override()

[30]:
dummy.ssh = None
dummy.clear_ssh_override()
print('the reverted ssh string is', dummy.ssh)
the reverted ssh string is ssh -p 22 -q username@remote.connection.address

URL.utils

URL also provides a utils module, which contains both commonly used functions and more complex ones. The first of these are the mkdir and touch methods, which will create a directory and a file at the given path, respectively.

[31]:
test_mtime = int(time.time())

connection.utils.mkdir('temp_utils_test')
connection.utils.touch('temp_utils_test/create_me')
connection.utils.touch('temp_utils_test/create_me_also')
[31]:

There is also utils.ls, which returns the files as a list by default

[32]:
connection.utils.ls('temp_utils_test')
[32]:
['create_me', 'create_me_also']

The more powerful functions granted by utils are the search_folder, file_presence, and file_mtime methods.

These methods allow searching for a list of files, condensing the query down to a single call in each case. This is useful for high-latency remote systems, where an ls search for 100+ files could otherwise take a long time.

search_folder takes a list of files and a folder, returning a {file: bool} “truth-dict” of whether those files are present.

[33]:
connection.utils.search_folder(['create_me', 'not_present'], 'temp_utils_test')
[33]:
{'create_me': True, 'not_present': False}

Similarly, a more general form exists in file_presence, which will take a list of files and return a similar truth-dict of their presence on the remote.

[34]:
connection.utils.file_presence(['temp_utils_test/create_me',
                                'missing_folder/file',
                                'temp_utils_test/not_present'])
[34]:
{'temp_utils_test/create_me': True,
 'missing_folder/file': False,
 'temp_utils_test/not_present': False}

If the file modification time is what you want, then file_mtime can be used in a similar way. This is the method called internally by file_presence, so it incurs no extra runtime.

[35]:
times = connection.utils.file_mtime(['temp_utils_test/create_me',
                                     'temp_utils_test/create_me_also',
                                     'temp_utils_test/not_present'])

print(times)
{'temp_utils_test/create_me': 1738078661, 'temp_utils_test/create_me_also': 1738078661, 'temp_utils_test/not_present': None}

Tunnels

ssh tunnels allow a persistent connection to a machine. You could, for example, create a tunnel to a machine hosting a jupyter instance to access it locally.

Let's demonstrate what that looks like.

[37]:
remote_url = URL("remote.host")

remote_url.cmd("jupyter lab --ip=0.0.0.0", dry_run=True)
[37]:
ssh -p 22 -q remote.host 'jupyter lab --ip=0.0.0.0'

Note

The --ip=0.0.0.0 modifier is required to allow external connections.

This would start a jupyter lab session with the base python environment. Note that this is a simplified example; in your case, you will most likely need to follow a different procedure to start jupyter.

In any case, let's assume that the server is running on port 8888 (the default) on remote.host.

Now we can create a tunnel to it. If we have (or want) any locally running servers, we cannot reuse port 8888, so let's redirect to 9999.

[38]:
tunnel = remote_url.tunnel(local_port=9999, remote_port=8888, background=True, dry_run=True)

print(tunnel.cmd)
ssh -p 22 -q remote.host -N -L :9999:remote.host:8888 remote.host

This command creates a tunnel between your machine and the remote. The jupyter server will now be available at 127.0.0.1:9999.

The local IP address can be changed by setting local_address.
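A sketch, assuming local_address is accepted as a keyword argument alongside the port arguments (this is an assumption; check the URL.tunnel documentation for the exact interface):

# bind the local end of the tunnel to all interfaces (hypothetical keyword usage)
tunnel = remote_url.tunnel(local_port=9999, remote_port=8888,
                           local_address='0.0.0.0', background=True, dry_run=True)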

Important

remotemanager will attempt to avoid leaving “dangling” tunnels open. To help with this, the PID of a non-dry_run tunnel will be reported. However, assigning the tunnel to a variable is advised, as it allows you to call the kill() method.
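In a script, that might look like the following sketch (placeholder setup; without dry_run this would open a real tunnel):

tunnel = remote_url.tunnel(local_port=9999, remote_port=8888, background=True)
# ... work with the forwarded service at 127.0.0.1:9999 ...
tunnel.kill()  # close the tunnel explicitly when done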

Note

The with context (below) is the preferred method of handling tunnels, if it is possible for your use case. This will handle the safe closure of the tunnel on your behalf, even in the case of an exception.

The with context

It is possible to execute commands using the Python with context. This ensures that the process is properly killed if an exception occurs.

Note

This can be used to ensure that your tunnels are closed if you’re using them within a script.

We can demonstrate this by generating a long running async command and then causing a failure within the context.

[39]:
t0 = time.time()

with url.cmd("sleep 300 && echo 'foo'", asynchronous=True) as c:
    print(f"my PID is {c.pid}")
    raise RuntimeError("Raise an exception here to force the with(...) to exit")
my PID is 42323
Exiting context, killing pid 42323
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[39], line 5
      3 with url.cmd("sleep 300 && echo 'foo'", asynchronous=True) as c:
      4     print(f"my PID is {c.pid}")
----> 5     raise RuntimeError("Raise an exception here to force the with(...) to exit")

RuntimeError: Raise an exception here to force the with(...) to exit
[40]:
print(f"dt: {time.time() - t0:.2f}s")
dt: 0.02s

Note that the time to execute is significantly shorter than the sleep. We can check that the process no longer exists by querying the pid:

[41]:
import psutil
psutil.pid_exists(c.pid)
[41]:
False

ProxyJump

If you can connect with an ssh ... command, you can do it using URL. An extension to this is ProxyJump, which allows you to “hop” between hosts to get to your destination. If you have this set up, you will have an ssh config file that looks somewhat like this:

Host remote-endpoint
    User username
    Hostname remote.endpoint.address
    ProxyJump remote-middleman
Host remote-middleman
    User username
    Hostname remote.middleman.address

The following URL is an example that would connect using these parameters:

[42]:
proxyurl = URL(host='remote-endpoint')

print(proxyurl.userhost)
remote-endpoint
[43]:
print(proxyurl.cmd('echo "test"', dry_run=True))
ssh -p 22 -q remote-endpoint 'echo "test"'

Setting a Default URL

The standard default URL is one pointed at localhost; this is in reality a safety measure to ensure that the Dataset at least has a URL. However, this is not ideal for a remote workflow.

default_url is a property of Dataset, and can be set at the class level after importing, affecting any Datasets created afterwards.

[44]:
from remotemanager import Dataset

def func(inp):
    return inp

ds = Dataset(func, skip=False)

print(ds.url.userhost)
localhost
[45]:
url = URL("user@host")

Dataset.default_url = url
[46]:
ds = Dataset(func, skip=False)

print(ds.url.userhost)
user@host

I’m seeing errors that look like system messages

Added in version 0.10.15.

It is normal for machines to output information on connection: usage, documentation, disk quotas, etc. Sometimes, however, these messages are emitted on stderr rather than stdout. remotemanager will see this and assume that something has gone wrong. To prevent this, all ssh calls use the -q flag by default.

ssh -q flag

This flag suppresses most errors and warnings, and should allow for smoother control of your machine. If you're seeing strange behaviour with errors not being properly collected, you can either set url.quiet_ssh = False, or initialise the URL with URL(..., quiet_ssh=False).
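As a sketch (placeholder host), the second form looks like this; the -q flag should then no longer appear in the generated ssh string:

verbose = URL(user='username', host='remote.connection.address', quiet_ssh=False)
print(verbose.ssh)  # expect the -q flag to be absent here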